AITopics | deterministic markov decision process

Collaborating Authors

deterministic markov decision process

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Lower Bound on Howard Policy Iteration for Deterministic Markov Decision Processes

Asadi, Ali, Chatterjee, Krishnendu, de Raaij, Jakob

arXiv.org Artificial IntelligenceJun-17-2025

Deterministic Markov Decision Processes (DMDPs) are a mathematical framework for decision-making where the outcomes and future possible actions are deterministically determined by the current action taken. DMDPs can be viewed as a finite directed weighted graph, where in each step, the controller chooses an outgoing edge. An objective is a measurable function on runs (or infinite trajectories) of the DMDP, and the value for an objective is the maximal cumulative reward (or weight) that the controller can guarantee. We consider the classical mean-payoff (aka limit-average) objective, which is a basic and fundamental objective. Howard's policy iteration algorithm is a popular method for solving DMDPs with mean-payoff objectives. Although Howard's algorithm performs well in practice, as experimental studies suggested, the best known upper bound is exponential and the current known lower bound is as follows: For the input size $I$, the algorithm requires $\tildeΩ(\sqrt{I})$ iterations, where $\tildeΩ$ hides the poly-logarithmic factors, i.e., the current lower bound on iterations is sub-linear with respect to the input size. Our main result is an improved lower bound for this fundamental algorithm where we show that for the input size $I$, the algorithm requires $\tildeΩ(I)$ iterations.

artificial intelligence, iteration, machine learning, (14 more...)

arXiv.org Artificial Intelligence

2506.12254

Country: Europe (0.28)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (0.88)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.71)

Add feedback

Speed Optimization Algorithm based on Deterministic Markov Decision Process for Automated Highway Merge

Goto, Takeru, Toda, Kosuke, Kumano, Takayasu

arXiv.org Artificial IntelligenceMar-19-2025

This study presents a robust optimization algorithm for automated highway merge. The merging scenario is one of the challenging scenes in automated driving, because it requires adjusting ego vehicle's speed to match other vehicles before reaching the end point. Then, we model the speed planning problem as a deterministic Markov decision process. The proposed scheme is able to compute each state value of the process and reliably derive the optimal sequence of actions. In our approach, we adopt jerk as the action of the process to prevent a sudden change of acceleration. However, since this expands the state space, we also consider ways to achieve a real-time operation. We compared our scheme with a simple algorithm with the Intelligent Driver Model. We not only evaluated the scheme in a simulation environment but also conduct a real world testing.

artificial intelligence, machine learning, vehicle, (16 more...)

arXiv.org Artificial Intelligence

2503.14899

Country: Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.14)

Genre: Research Report (0.40)

Industry: Transportation > Ground > Road (0.51)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.62)

Add feedback

Max-Plus Matching Pursuit for Deterministic Markov Decision Processes

Bach, Francis

arXiv.org Machine LearningJun-20-2019

We consider deterministic Markov decision processes (MDPs) and apply max-plus algebra tools to approximate the value iteration algorithm by a smaller-dimensional iteration based on a representation on dictionaries of value functions. The setup naturally leads to novel theoretical results which are simply formulated due to the max-plus algebra structure. For example, when considering a fixed (non adaptive) finite basis, the computational complexity of approximating the optimal value function is not directly related to the number of states, but to notions of covering numbers of the state space. In order to break the curse of dimensionality in factored state-spaces, we consider adaptive basis that can adapt to particular problems leading to an algorithm similar to matching pursuit from signal processing. They currently come with no theoretical guarantees but work empirically well on simple deterministic MDPs derived from low-dimensional continuous control problems. We focus primarily on deterministic MDPs but note that the framework can be applied to all MDPs by considering measure-based formulations.

artificial intelligence, deterministic markov decision process, machine learning, (1 more...)

arXiv.org Machine Learning

1906.08524

Genre: Research Report (0.40)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.60)

Add feedback